Supervised acoustic topic model for unstructured audio information retrieval

Authors

  • Samuel Kim
  • Panayiotis Georgiou
  • Shrikanth Narayanan
Abstract

We introduce a modified version of the acoustic topic model for unstructured audio information retrieval applications. The acoustic topic model assumes that an audio signal consists of latent acoustic topics, each of which can be interpreted as a distribution over acoustic words. The proposed supervised acoustic topic model is based on supervised latent Dirichlet allocation (sLDA), whereas the conventional acoustic topic model is built upon latent Dirichlet allocation (LDA), which learns its parameters in an unsupervised manner. Experimental results on the BBC Sound Effects Library indicate that the supervised acoustic topic model improves classification accuracy by learning its parameters with respect to the categories of the audio clips, i.e., their semantic and onomatopoeic labels.

Index Terms: audio information retrieval, acoustic topic model, unstructured audio, supervised LDA
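
The generative assumption stated in the abstract (a clip is a mixture of latent acoustic topics, each topic is a distribution over acoustic words, and in the supervised variant a category label is tied to the clip's topics) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no model details, so the softmax label response, the function names, and every numeric value below are assumptions chosen only for illustration.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_clip(num_words, alpha, topic_word_probs, eta, rng):
    """Generate acoustic words and a label for one clip (sLDA-style sketch).

    alpha            -- Dirichlet prior over topics (hypothetical values)
    topic_word_probs -- one distribution over acoustic words per topic
    eta              -- per-class weights on topic frequencies (assumed
                        softmax response; not specified in the abstract)
    """
    topic_ids = list(range(len(alpha)))
    word_ids = list(range(len(topic_word_probs[0])))

    # Clip-level topic proportions.
    theta = sample_dirichlet(alpha, rng)

    topics, words = [], []
    for _ in range(num_words):
        # Pick a latent acoustic topic, then an acoustic word from that topic.
        z = rng.choices(topic_ids, weights=theta)[0]
        w = rng.choices(word_ids, weights=topic_word_probs[z])[0]
        topics.append(z)
        words.append(w)

    # Empirical topic frequencies of the clip drive the label.
    z_bar = [topics.count(k) / num_words for k in topic_ids]
    scores = [sum(e * f for e, f in zip(row, z_bar)) for row in eta]
    peak = max(scores)
    class_weights = [math.exp(s - peak) for s in scores]  # unnormalized softmax
    label = rng.choices(list(range(len(eta))), weights=class_weights)[0]
    return theta, words, label


# Hypothetical setup: 3 topics, 4 acoustic words, 2 categories.
ALPHA = [0.5, 0.5, 0.5]
TOPIC_WORD_PROBS = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.4, 0.4],
]
ETA = [
    [2.0, 0.0, -1.0],
    [-1.0, 0.0, 2.0],
]

if __name__ == "__main__":
    theta, words, label = generate_clip(
        20, ALPHA, TOPIC_WORD_PROBS, ETA, random.Random(0)
    )
    print("topic proportions:", theta)
    print("acoustic words:", words)
    print("label:", label)
```

In plain LDA the label step is absent and the topics are fit to the words alone; the supervised variant fits topics and label weights jointly, which is the difference the abstract credits for the gain in classification accuracy.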

Similar resources

Latent acoustic topic models for unstructured audio classification

Samuel Kim, Panayiotis Georgiou and Shrikanth Narayanan APSIPA Transactions on Signal and Information Processing / Volume 1 / December 2012 / e6 DOI: 10.1017/ATSIP.2012.7, Published online: 10 December 2012 Link to this article: http://journals.cambridge.org/abstract_S2048770312000078 How to cite this article: Samuel Kim, Panayiotis Georgiou and Shrikanth Narayanan (2012). Latent acoustic topic...

Automatic Audio Tagging and Retrieval Using Semi-Supervised Canonical Density Estimation

We apply SSCDE (semi-supervised canonical density estimation), a semi-supervised learning method based on topic modeling, to audio tagging and retrieval problems. SSCDE was originally proposed as an image annotation and retrieval method, but it can also be applied to audio data. The SSCDE method consists of two parts: 1) extraction of a low-dimensional latent space representing topics of sounds ...

Unsupervised topic adaptation for lecture speech retrieval

We are developing a cross-media information retrieval system, in which users can view specific segments of lecture videos by submitting text queries. To produce a text index, the audio track is extracted from a lecture video and a transcription is generated by automatic speech recognition. In this paper, to improve the quality of our retrieval system, we extensively investigate the effects of a...

A Study on Music Genre Recognition and Classification Techniques

Automatic classification of music genre is a widely studied topic in music information retrieval (MIR), as it is an efficient method to structure and organize the large number of music files available on the Internet. Generally, the genre classification process of music has two main steps: feature extraction and classification. The first step obtains audio signal information, while the second one...

The Influence of Word Detection Variability on IR Performance in Automatic Audio Indexing of Course Lectures

This paper presents a study of the influence of acoustic variability on topic spotting performance in an application involving automatic indexing of course lectures. The application involves users formulating keyword queries to an indexing system which includes phone lattice based acoustic representations of audio material, a mechanism for keyword searching of a phone lattice, and a measure for...

Publication date: 2010